home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Technotools
/
Technotools (Chestnut CD-ROM)(1993).ISO
/
lang_c
/
mscheap2
/
heapinfo.txt
< prev
next >
Wrap
Text File
|
1990-03-13
|
54KB
|
1,261 lines
Replacement Heap Manager for MSC
--------------------------------
Copyright (c) 1990 by Optimal Software, All Rights Reserved
Optimal Software, 4 Lacy Lane, Nashua, NH 03061-2151
603-880-9844 CIS [73710,406]
This document describes a replacement heap manager for MSC 5.1.
The library is binary compatible with the original MSC functions
so recompilation is not necessary. All of the new functions are
exact replacements for the original MSC "far" heap functions.
I. General
The replacement heap manager differs from the original MSC heap
manager in a number of ways. The following sections address each
of the differences according to the following categories:
- Section II : functionality and performance
- Section III : flexibility and user control
- Sectiom IV : error handling and debugging
- Section V : compatibility with MSC run-time library
- Section VI : version notes
- Section VII : licensing and distribution
There is also a difference in the approach to the design of the
software. The design of the MSC heap manager seems to be based on
a philosophy of procrastination. I.e., processing is deferred until
it is absolutely necessary. This approach enhances the performance
of basic operations but degrades the overall performance of the
system. (Penny-wise and pound-foolish).
In contrast, the replacement heap manager is a philosophical
activist. I.e., processing is performed as soon as convenient.
This alternate approach uses more sophisticated basic operations in
an attempt to maintain the system in the best condition possible.
(Penny-foolish but pound-wise).
In practice the MSC heap manager may avoid some work that the
replacement heap manager undertakes. On the other hand, the
MSC heap manager is often confronted with pathological situations
of its own construction. The replacement heap manager attempts to
avoid such unpleasantness.
In fact the objectives of the two designs are more easily decribed
by what they avoid than what they pursue. The MSC heap manager
avoids work while the replacement heap manager avoids failure.
If you have a heap-intensive application that does not use much
memory and maximum performance is worth an occasional crash
"out of heap memory" stick with the original MSC heap manager.
It is heavily optimized toward small applications. If you need
to manage large heaps, or are tight on memory, or need reliable
performance, then the replacement heap manager may be useful.
II. Functionality and performance
These differences are fundamental to the value of the replacement
heap manager. They define how well the heap manager does its
primary job: managing the heap.
A. Efficient use of memory
The replacement heap manager does not inadvertently fragment the
heap in as many ways as the MSC heap manager. It allows the
user program to define the direction in which the heap should
be optimized.
In order to expedite allocation operations there is a global
list of free chunks. The list is doubly linked, and ordered
as defined by _heapmode. MSC does not appear to have a free
list. It seems to rely on the hope that garbage collection is
infrequent to compensate for the fact than searching the entire
heap for free entries is extremely expensive.
The replacement heap manager optimizes its operations based on
the value of the variable _heapmode.
- The _HEAPTIME setting sacrifices space efficiency for
performance. The free list is not maintained in order
and allocation takes the first entry that fits (LIFO).
This setting is useful for programs with small amounts
of volatile data.
This setting also tries to meet allocation needs without
merging free blocks. If the allocation fails the free
list is condensed and another attempt is made. The MSC
heap manager operates in a similar manner, fragmenting
the heap and wasting space in an attempt to save time.
- If _heapmode is _HEAPSIZE the heap manager minimizes the space
occupied by the heap by maintaining the free list in address
order (first fit). This technique tends to keep the in-use
heap entries in low memory so that _heappack() can release
unused space back to DOS. This setting is useful for programs
which spawn other programs, or use large buffers, or are
generally tight on memory.
- If _heapmode is _HEAPSPACE the heap manager minimizes the
wasted heap space by maintaining the free list in order of
size (best fit). This technique permits the most effective
use of the available heap space. It is useful for programs
that must handle overflow via disk files. The efficient
utilization of heap memory minimizes the need for disk access.
Note that changing the _heapmode on the fly does not
automatically re-order the free list. Neither does it cause
problems. As the heap is used the free list will lose the
old ordering and conform to the new optimization.
The default mode is described by _HEAPMODE and may be set
from the compiler command line when the library is built by
defining the symbol _HEAPMODE. E.g., -D_HEAPMODE=_HEAPSIZE.
The factory setting is _HEAPTIME as this is the closest
approximation to the original MSC heap manager.
B. Default data segment automatically minimized at startup.
The link switch /cp does not affect exec/spawned programs.
They always get a 64Kb default data segment no matter how
they are built or adjusted with exemod. The replacement
heap manager handles this problem automatically. In programs
build with "far" data ("compact", "large", and "huge" models)
the "near" heap is shrunk to the minimum and the rest of the
memory is returned to DOS.
Credit for this goes to Compuserve/MSSYS/DL3/alloc.arc/reducedd.c
by Willard Gersbacher [76117,2611]. It is a neat and clean
solution to the lack of cooperation between MS-DOS and MS-C.
C. Rational handling of NULL pointers and requests for zero bytes
There is confusion regarding requests for zero bytes and
handling of NULL pointers. The following discussion attempts
to clarify the documented behavior of the MSC routines, the
actual behavior of the MSC routines, and the behavior of the
replacement routines.
1. _Msize()
The MSC documentation does not address the result of
_msize(NULL). The actual result is some kind of core
constant.
The replacement version of _msize( NULL ) returns zero.
2. Malloc()
The MSC documentation states that malloc(0) returns NULL.
This is not true. Malloc(0) returns a pointer to a
zero-length entry.
The replacement is exactly the same.
3. Calloc()
The MSC documentation states that calloc(0,n) and calloc(n,0)
return NULL. This is not true. Calloc(0,0) returns a pointer
to a zero-length entry.
The replacement is exactly the same.
4. _Expand()
The MSC documentation does not address the behavior of
_expand() when called with a NULL pointer or a zero length.
The actual behavior is detailed below.
_expand( NULL, 0 ) returns NULL
_expand( NULL, 100 ) returns NULL
_expand( !NULL, 0 ) returns a pointer to 0 bytes
_expand( !NULL, 100 ) returns a pointer to 100 bytes
The replacement is exactly the same.
5. Realloc()
The MSC documentation specifies that realloc( NULL, size )
functions like malloc( size ) and that realloc( pointer, 0 )
returns NULL. This is correct except for the case covered
by both rules (they conflict). The actual behavior of
realloc( NULL, 0 ) is to return a pointer to zero bytes as
shown below.
realloc( NULL, 0 ) returns a pointer to 0 bytes
realloc( NULL, 100 ) returns a pointer to 100 bytes
realloc( !NULL, 0 ) returns NULL
realloc( !NULL, 100 ) returns a pointer to 100 bytes
Because the conflict between the realloc() rules creates
an inconsistency and because of the problems with the
malloc() documentation vs behavior the replacement for
realloc() is rational rather than exact. The difference is
that realloc( pointer, 0 ) returns a pointer to zero bytes
instead of NULL.
6. The file stdmsc.c is an easy way to verify the accuracy of
the information provided here.
D. DOS interface
The MSC heap manager's interface to DOS automatically rounds
_amblksiz up to the next power of two when extending the heap
memory pool. This action encourages fragmentation of the heap
because user requests for memory are not usually oriented around
the powers of two so there are gaps left in the heap when the
heap is extended.
The replacement heap manager does not force power-of-two sizing
on the DOS interface. It rounds _amblksiz up to the next paragraph
only. This approach gives the application program precise control
over the heap.
E. Consistent "far" and "huge" heap handling
The Intel architecture, segmented and non-relocatable, is indeed
"brain-dead", but this has been obvious for decades. The modern
problem has more to do with software that amplifies the deficiency
rather than dealing with it.
1. The MSC heap manager cannot handle "huge" heap entries.
Instead the application program must use the DOS memory
manager via halloc() for regions that exceed a segment
(64Kb). This requirement imposes three negative conditions
on application programs.
a. DOS is slow.
The DOS memory manager has a lot of overhead. While
large blocks of memory are usually not volatile, there is
no reason to impose an unnecessary performance penalty.
b. Lack of functionality
The MSC heap manager includes only halloc() and hfree().
There is no hrealloc(), nor is there a query function
like _msize() to determine the size of an allocated region.
c. No safe way to deal with both types of heap memory
The distinction between "far" and "huge" heap pointers
is an onerous one. It requires application programs
to decide, depending on the size of the request, which
function to use to obtain the memory block. Then the
application must record the selection in order to properly
dispose of the memory. Free() will not handle a huge
heap pointer, and hfree() will not handle a "far" heap
pointer.
2. The replacement heap manager deals with these problems
by providing a single handler for both address models.
There is no need to segregate the "far" and "huge" heap
entries. In addition to simplifying things for the user,
this also cuts down on the heap fragmentation and allows
complete optimization of heap access via _heapmode.
Unfortunately, maintaining compatibility with the MSC heap
manager implies that the limitations of that design continue
to influence the use of the heap. E.g., in order to precisly
mimic the MSC heap manager, calloc() will fail to allocate
an area larger than 64Kb. For that you need _hcalloc().
This is irritating, but not crippling because _hcalloc() is
a proper superset of calloc(). You can use it anywhere that
you can use calloc(). Similarly, _hmsize() is valid anywhere
that _msize() is valid. (However, these substitutions imply
data conversions of size_t to long).
F. Benchmarks
Like all benchmarks, testing a heap manager empirically is an
invitation to distortion, but it is worth mentioning. Depending
on the pattern of use the MSC heap manager is either twice as
fast as the replacement or 100 times slower. Two orders of
magnitude is a lot of variability!
1. Speed
The above comparison uses the replacement heap manager as
the reference point because its performance is predictable.
The MSC heap manager is far more sensitive to the pattern
of access. Two extreme cases serve to illustrate this fact.
Given a performance test based on malloc(constant) where
the total heap memory pool is much smaller than the size
of available memory the MSC heap manager is quite fast.
The replacement heap manager is only half as fast as the
original under these conditions.
Equally extreme is a test based on random allocations,
deallocations, and reallocations of random size, where the
total demand matches the size of available memory. In this
case the MSC heap manager chokes on heap fragmentation and
garbage collection. It is up to 200 times slower than the
first test. The replacement is a slightly slower than it was
in the first test.
The extreme tests are indicative of trends, but not definitive.
The file bench.c implements a mixture of 3 parts malloc(),
2 parts free(), and 1 part realloc() as representative of
heap processing. The sizes of the heap entries are skewed
toward the smaller chunks as an example of reasonable heap
usage. The exact distribution is triangular, with a block
of size {N:0..32767} appearing with a probability of
(32768-N) / (32768 * 32769 / 2).
Running the above test with various iteration limits confirms
that the replacement heap manager performs linearly. It takes
twice as long to do twice as much. The MSC performance is not
so predictable. It appears to be a power curve of some sort.
The raw data (in seconds) is as follows:
Iterations MSC New
---------- --- ---
1000 1 0
2000 4 1
4000 8 2
8000 34 4
16000 98 8
32000 231 17
64000 499 33
128000 1034 66
256000 132
512000 265
Incredibly, the above data indicates that the MSC heap
manager is SLOWER than the replacement heap manager. Further,
as the constraints get tighter, the MSC heap manager falls
further and further behind.
Out of curiosity, we ran the same test using a constant
quantum of allocation and obtained the following data:
8 bytes 80 bytes
Iterations MSC New MSC New
---------- --- --- --- ---
1000 0 0 0 0
2000 0 0 0 0
4000 0 0 0 0
8000 0 1 0 1
16000 0 1 1 1
32000 1 3 3 2
64000 4 5 125 11
128000 9 16 395 27
256000 27 48 919 59
512000 5128 112 1999 125
It appears that the MSC heap manager is doing fine until
it has to start garbage collection. At that point it
suffers horribly. Note the sudden transition from about
1/2 minute to about 3 hours!
All of the above tests were run three times, once for
each optimization mode. The size/space optimization modes
were about 75% as fast as the time-optimized mode. The
figures in the tables above reflect the time-optimized
modes as time optimization is the closest match to the
functionality of the MSC heap manager.
2. Memory
The memory usage statistics can be obtained from the output
of _fheapdump(). Depending on the usage pattern and the
optimization mode the memory efficiency varies from 75%
to 98%. The space-optimized mode (best fit) turns out to
be quite efficient. The size-optimized mode (first fit) is
almost as efficient, but it does a reasonable job of limiting
the growth of the heap. The time-optimized mode (LIFO) is
roughly 85% as efficient as the space-optimized mode.
Detailed figures were not compiled for the MSC version, but
inspection of the failure counts in the output of the test
program showed that even the time-optimized replacement
heap manager is considerably better than the MSC heap manager.
This happens because the MSC heap manager wastes more space on
fragmentation than it saves in overhead per block.
3. Real software
As a rough (subjective) test we linked the replacement heap
manager with a medium sized application (40K LOC, 150Kb
heap requirement, 450Kb (overlaid) executable) and found
it noticeably improved. Previously, it had suffered about
15% overhead in the form of fragmented, unusable free space.
With the replacement heap manager the wasted space went to zero.
Literally there were no free blocks left amoung the allocated
blocks, just one at the end of the heap.
Try it on one or two of your programs and see what you get.
You dont have to recompile anything, just link with heap.lib.
III. Flexibility and user control
The features described here allow the application programmer to
control the behavior of the heap. The capability for tuning the
heap performance simplifies the design of application software
because it is not necessary to design around the limitations of
the heap manager.
A. Heap entries can be relocated and the heap shrunk as needed.
The _relocate() and _heappack() functions provide an efficient
method of dealing with multi-phase processing. _Relocate()
serves to reposition entries as low as possible in memory.
Note that your mileage may vary, but _relocate() is guaranteed
to do no harm. I.e., it will only act when improvement is
certain. Thus it is feasible to make multiple passes through
allocated memory (typically until nothing moves) in order to
obtain as much compression as possible. A pass in order of
ascending addresses is guaranteed to obtain the maximum possible
compression in a single pass.
In addition to compressing the heap in order to maximize the
free DOS memory, the relocation process is the basis for a
heap garbage collector. An allocation that fails due to
heap fragmentation may succeed if the heap is compacted.
Applications that cannot afford to crash "out of heap" can
overcome heap fragmentation by relocating the active heap
entries.
_heappack() releases excess heap memory back to DOS.
The file packdemo.c illustrates the basic techniques.
B. Functional extensions to the MSC design.
These capabilities exceed those of the MSC heap manager, so
they contain the seeds of incompatibility. However, their
use is in no way mandated. The replacement heap manager
may be treated exactly as the MSC heap manager.
1. Manifest constants
These describe the constants specific to the replacement
heap manager.
a. SIZE_MAX
This constant describes the maximum size of a heap
entry. SIZE_MAX is to size_t as INT_MAX is to int.
The value is 65535.
b. Heap optimization modes
_HEAPTIME, _HEAPSIZE, and _HEAPSPACE define the
legitimate values for _heapmode. _HEAPTIME produces
LIFO free list handling, and maximizes throughput
at the expense of memory eficiency. _HEAPSIZE produces
first-fit free list handling, and minimizes the overall
size of the heap memory pool. _HEAPSPACE produces
best-fit free list handling, and minimizes the wasted
heap space.
The interpretations are described in detail in section II.A.
2. Global variables:
These variables are declared in heap.h and defined in
heaputil.c. They provide application programs with control
over the behavior of the heap manager.
a. int _Heapmode;
_Heapmode controls the optimization mode of the heap
manager via free list handling algorithms. Section II.A
describes its use in detail.
b. size_t _Heappad;
_Heappad controls the safety margins allocated as part of
every heap entry. The default value is zero, but an
application program may set a non-zero value prior to
using the heap.
Section IV.B gives details about debugging with _heappad.
3. Entry points
These functions extend the power of the heap manager to
cover previously unaddressed issues.
a. void _heapdump( FILE *fp, int verbose );
_heapdump() is a utility function that writes a complete
description of the state of the heap to the indicated
stream file. The report includes the following information:
- Key to notation
- Global variables
- _heapmode
- _heappad
- Arena headers -- DOS interface
- Address [pointer]
- Size
- Forward and backward links [pointers] on arena list
- Chunk headers -- C RTL interface
- Address {<pointer>}
- Size
- Forward and backward links <pointers> on chunk list
- Forward and backward links {pointers} on free list
- Status: ok, read-only, or bad.
- In debug mode additional info is reported:
- Function that last changed the entry
- Desired size of the entry
- Transaction (sequence) number
- Name of source file
- Line number in source file
- Controls for free and join lists
- Address {pointer}
- Number of entries on the list (not including master)
- Forward and backward links {pointers}
- Summary statistics on heap usage
- Pagination: formfeed
Verbose is a boolean argument that controls the level of
detail in the heap status report. In non-verbose mode
the report excludes the chunk info.
b. void _heappack( void );
_Fheappack() releases unused heap memory back to DOS. It
searches the entire heap, shrinking arenas that contain
active entries to the minimum, and totally freeing arenas
that contain no active entries.
Section II.B.1 also describes heap compaction.
c. int _heapwatch( void far *ptr, int enable );
_Fheapwatch() establishes a heap entry as read-only. The
enable argument controls the status of the heap entry.
_Heapchk() tests all the read-only entries in the heap
and reports _HEAPBADPTR if one has changed. Similarly,
the debugging routines will trigger a diagnostic report
if a read-only entry changes.
Section IV.F provides details on the use of _fheapwatch().
d. char *_heapstat( int status );
_Heapstat() translates heap status codes obtained from
_heapchk(), _heapset(), and _heapwalk() into descriptive
strings.
e. void far *_relocate( void far *ptr );
_Relocate() repositions a heap entry as low as possible
in memory. It is the basis for heap compaction and
garbage collection as described in section II.B.1.
f. "Huge" versions of standard functions
The _hcalloc, _hexpand, _hfree, _hmalloc, _hmsize, and
_hrealloc functions handle "huge" heap entries (larger
than 64Kb) just as their '_f' counterparts handle "far"
heap entries (less than 64Kb).
4. Compiler options
These options affect applications that #include "heap.h".
Enable the special options via the compiler command line
with -D<symbol>=<value>.
a. STDMSC -- enforcing strict source code compatibility
This symbol controls the declaration of the symbols,
variables, and functions unique to the replacement heap
manager. If STDMSC is defined heap.h behaves just like
malloc.h. If STDMSC is not defined, the extra capabilites
of the replacement heap manager are available to
application programs.
b. _HEAPDEBUG -- debugging heap problems
This symbol enables an alternate set of functions that
diagnose heap problems. The functions that change the
heap are replace by equivalents that check the heap
consistency with _heapchk(), perform the standard heap
operation, and then record the transaction for diagnostic
reports triggered by errors.
The defined value of the symbol is taken to be the stream
on which error reports should appear. For example,
-D_HEAPDEBUG=stdout sends error reports to the standard
output stream.
Enabling _HEAPDEBUG also enables a stringent form of pointer
checking. Instead of a test on the value of the pointer
offset, _HEAPDEBUG checks that a matching entry exists in
the heap (this test is free as _HEAPDEBUG must traverse the
heap to check for errors anyway). Thus _HEAPDEBUG guarantees
that no pointer that happens to have a zero offset can masquerade
as a heap pointer.
STDMSC overrides _HEAPDEBUG. If both are defined _HEAPDEBUG
is ignored.
For further details see section IV.D.
c. _HEAPTRACE -- tracing heap activity
This symbol enables an alternate set of functions that
trace heap activity. Functions that change the heap are
replaced by equivalents that write a transaction record
to the trace log file after every heap access. The
transaction reports give the sequence number, the name
of the function, the value of the parameters, the source
file and line number and the returned result.
The defined value of the symbol is taken to be the stream
on which transaction reports should appear. For example,
-D_HEAPTRACE=stdout sends reports to the standard output file.
Enabling _HEAPTRACE implies that _HEAPDEBUG is enabled as well.
STDMSC overrides _HEAPTRACE. If both are defined _HEAPTRACE
is ignored.
For further details see section IV.E
C. Direct access to the heap data structures.
The heap manager is a layer between two interfaces. On the
bottom is the DOS memory manager that believes in arenas and
paragraph alignment, but handles blocks larger than 64 Kb
easily. On the top is the C run-time library definition that
believes in size_t, word alignment, and other limitations of
32-bit segmented addressing on a 16-bit machine with an 8-bit
oriented architecture.
The two interfaces are represented explicitly. On one hand
there are arenas that match the DOS requirements (typdef _ARENA).
The memory contained in an arena is managed with a list of chunks
where each chunk conforms to the requirements of the C interface
(typedef _CHUNK).
In order to support reallocation and expansion of free'd memory
blocks there is a separate list of chunks that have been free'd
that are potential targets for reuse. Prior to an allocation
request the entries in this list are added to the free list, and
adjacent free blocks are joined. (Thus the name join list from
to-be-joined). This list is usualy empty and when not empty it
is short so the overhead is tiny.
To "walk" the heap you traverse the list of arenas based on
_heaphead, and within each arena traverse the list of chunks based
on the arena header. The free and join lists may also be traveled
based on their headers _freelist and _joinlist, but this is not
actually necessary as all chunks, free or used, appear on their
respective arena-control lists.
D. The heap manager will handle non-DOS memory. If you provide a
block of memory with an initialized arena and chunk header the
heap manager will happily manage the it as part of the memory
pool. For more information see the get-new-arena portion of
_xalloc() in heaputil.c. Be sure to set your _ARENA.isdos to
FALSE to prevent _heappack() from trying to give it back to DOS.
This feature is especially useful on systems with non-DOS memory
such as EMS or XMS "upper" memory in the BIOS area from C0000 to
FFFFF, or video graphics memory in the A0000-AFFFF region.
The next version of the heap manager will provide functions to
add and delete non-DOS blocks of memory from the heap.
IV. Error handling and debugging
The replacement heap manager provides a heap that is harder to damage,
and easier to debug than the original heap manager. The following
sections describes the salient features.
A. Comprehensive error checking
The heap checking performed by the MSC _heapchk() is minimal. It
appears to be a simple limit test of heap pointers against the
boundaries of the entire heap area.
The replacement heap manager uses a detailed consistency check.
Every pointer is validated against the individual arena limits.
Further, the nodes in the linked lists are checked for reciprocity
(MSC cannot do this because it does not have a doubly-linked list).
The reciprocity test is an extermely powerful method of heap
validation. It is even possible to repair a damaged heap by
appropriate use of the linked lists. This facility is not
included in this library because its usefulness is questionable,
and it grossly exceeds the fuctionality of the specifications
(the MSC heap manager).
The consistency check also verifies that the safety margins set
by _heappad are intact, and that the read-only entries established
by _heapwatch have not been changed.
B. User controlled allocation padding for debugging.
One classic problem with heap management is writing past the end
of a buffer obtained from the heap. This typically produces
extremely hostile effects. As the routines that handle
heap-generated buffers are usually embedded in the lower layers
of software, checking them can be tedious.
One method of testing this kind of problem is to add extra space
at the ends of the allocated areas. If the failures stop, there
is evidence of a heap problem. Of course, changing sizes moves
things around which can turn errors on and off randomly so the
evidence is flimsy. The problem with this debugging technique
is that changing the buffer lengths manually is a time-consuming,
error-prone operation with marginal utility.
The replacement heap manager provides a control for automatically
extending all heap allocation requests. Thus an initialization
parameter is enough to provide evidence for or against heap
problems. Typically the padding is set to one or two times the
size of the structures being manipulated.
Because the padding affects the distance between the data area
and the heap control structures it cannot be changed after the
heap has been initialized. The heap manager takes a copy of
_heappad when the heap is initialized and uses the copy thereafter.
Thus _heappad must be set as early as possible, typically in main().
Note that command-line wild cards processed with setargv.obj use
the heap before main is started. Normal programs without wild card
expansion can use:
extern size_t _heappad; /* declare the global variable */
int main()
{
_heappad = 100; /* set the value */
}
But programs with wild card handling enabled (those linked with the
special version of setargv.obj) must use:
size_t _heappad = 100; /* define the global and its value */
int main()
{
}
This sets the value at compile/link time rather than at run time.
The safety margins are automatically filled when a heap entry
is allocated. The fill pattern is initially alternating bits,
0x55, but it is reset by every call to _fheapset(). _Fheapchk()
compares the two safety margins and reports _HEAPBADPTR if they
differ.
C. Heap pointers are distinctive and are checked for validity.
All heap pointers are aligned on paragraph boundaries, so that
a non-zero offset indicates an invalid pointer. Since all of
the functions receiving pointers as arguments check for invalid
pointers it is hard to corrupt the heap with a non-heap pointer.
When an invalid pointer is detected the functions write an error
messaage to stderr and exit(-1). To change this behavior adjust
the _fptr_chk function in heaputil.c
Note that the paragraph alignment principle imples that a
protected-mode equivalent of the replacement heap manager is
feasible, with each heap entry having it's own selector value
and (potentially) a individual set of permissions.
D. Debugging with _HEAPDEBUG
The symbol _HEAPDEBUG enables an alternate set of heap functions
that aid in debugging. The debug functions keep track of every
heap entry they process, recording the function name, desired size,
source file name and line number responsible for the entry, plus a
transaction (sequence) number. The debug functions also check the
heap status prior to every heap access and print a diagnostic report
if _heapchk() reports a problem. The value of _HEAPDEBUG is taken
to be the stream on which diagnostic error reports should appear.
Enabling _HEAPDEBUG also enables a stringent form of pointer
validity checking. Instead of a test on the value of the pointer
offset, _HEAPDEBUG checks that a matching entry exists in the heap
(this comes free because _HEAPDEBUG must traverse the heap anyway
to check for errors). Thus _HEAPDEBUG guarantees that no pointer
that happens to have a zero offset can masquerade as a heap pointer.
When a debug function detects an error it writes a diagnostic
report to the indicated stream file and exits. The report includes
the following information:
- the function detecting the error (function name, source
file, and source line number)
- the heap information on the offending entry (actual size,
address, and used/free status)
- the historical information on the entry (desired size,
creator function, source file name and line number)
- a dump of the heap as described in section III.B.3.a
including all of the debugging information
Debugging with an appropriate (non-zero) value in _heappad
increases the chances of detecting a wild pointer or a pointer
running past an end of a heap entry.
The debugging capability of the replacement heap manager features
the following benefits:
- Mapping multiple heap entries to each debug record
permits debugging of very large heaps by minimizing
the memory overhead of debug mode.
- Variable safety margin (_heappad) allows you to match
the size of application data structures.
- No source code changes are necessary to activate the
debugging capability, only recompilation with the
-D_HEAPDEBUG=<stream> definition.
The file dbugdemo.c illustrates the basic techniques.
E. Debugging with _HEAPTRACE
The symbol _HEAPTRACE enables an alternate set of heap functions
that aid in debugging. The debug functions report every heap
transaction, including the sequence number, the name of the
function, the value of the parameters, the source file and line
number, and the returned result. The transaction reports are
sent to the stream defined by -D_HEAPTRACE=<stream>.
Note that the heap transaction sequence number is only a rough
guide to the chronology of events. First, heap operations in
modules compiled without _HEAPTRACE or _HEAPDEBUG do not affect
the sequence number so there may be hidden transactions. Second,
when the transaction is saved in a debug record the sequence
number is stored as well. Because all heap entries related to
an individual function call (and of identical size) share a
debug record they will all appear to have the sequence number
of the most recent member.
Enabling _HEAPTRACE also enables _HEAPDEBUG and _HEAPCHECK.
F. Monitoring heap data
The _heapwatch() function establishes the read-only status of
heap entries. Entries marked read-only are tested by the
_heapchk/set/walk() family of functions, and also tested by
the debugger when _HEAPDEBUG or _HEAPTRACE are defined.
Read-only heap entries may be modified, and their read-only
status preserved by re-invoking _heapwatch() on the affected
heap entry.
V. Compatibility with MSC -- only 99.44%
The replacement heap manager is as close as possible to full
compatibility with the original MSC heap manager, but, inevitably,
there are a few differences.
A. STDMSC
The symbol STDMSC is available if strict source-code compatibility
with MSC is required. Applications compiled with -DSTDMSC are
guaranteed to compile with MSC's malloc.h instead of heap.h.
B. Compromises
Functionally, the replacement heap manager was to be identical
to the original MSC heap manager, but a few compromises intervened.
1. The replacement heap manager is not limited to allocations
of 65516 bytes or less. It will supply "far" blocks up to
65535 (actually 65536) bytes and "huge" blocks of up to 1Mb.
size. The MSC restriction of 65516 bytes apparently derives
from MSC's lack of support for large pointers. It uses 16-bit
operations on 32-bit pointers, which means that a region based
at 1234:000E can only extend to 1234:FFFF. Thus it loses up
to 14 bytes due to lack of paragraph alignment. Another six
bytes is lost to the heap control block associated with each
heap entry.
The replacement heap manager uses paragraph alignment and does
not count its heap control blocks against the segment limit.
Thus it is able to deliver a full segment of 64 Kb. (See
SIZE_MAX described in section III.B.1).
2. The replacement heap manager is larger than the equivalent
MSC routines. The difference is several KB depending on which
functions are used.
C. Address models
The replacement heap manager handles the "far" heap. The "near"
heap is handled by the standard MSC heap manager. In "small"
and "medium" programs, which use "near" as the default data
address model, the functions calloc(), free(), malloc(), _msize(),
and realloc() use the "near" heap and are not superceeded by the
replacement heap library. In "compact", "large", and "huge"
programs, which use "far" as the default data address model,
the functions calloc(), free(), malloc(), and realloc() use the
"far" heap and are superceeded by the replacement heap library.
Note that in "far" address model programs the replacement heap
manager automatically recovers unused space from the "near" heap
when the "far" heap is first accessed. This process is described
in section II.B.
VI. Versions (in reverse order)
6. Released 15-Mar-90
a. Bug fixes
i. Format of _fheapdump()
The format of the _fheapdump report has been adjusted so that
it makes sense in all situations. Perviously, some fields were
misplaced.
ii. Safety margins on _expand'd blocks
If _heappad is non-zero, and debugging is enabled, and a block
is adjusted with _expand or realloc without requiring repositioning
the previous version would report a spurious error because the
safety margin was not properly adjusted unless the data was moved.
iii. Tracing the "near" heap with _HEAPTRACE
The tracing routines for the "near" heap were garbled because
they had "far" pointer formats. The formats are now model
dependent.
b. Changes
i. Interaction of _heappad and SIZE_MAX
The value of _heappad reduces the size of the maximum allocation
supported by the "far" versions of the heap allocation routines.
The functions affected are _fcalloc, _fexpand, _fmalloc, _fmsize,
_frealloc, and, in programs compiled with "far" data, calloc,
_expand, malloc, _msize, and realloc.
This adjustment avoids a possible segment wrap-around due to
the partial pointer arithmetic used by the MSC compiler. If
you set _heappad to 100 and malloc(65500) the prior versions
would grant the request and return a pointer with an offset
of 100. If the application program attempts to access the
entire data region it will exceed the capacity of the 16-bit
offset, and the pointer will wrap around to the beginning of
the segment.
Note that this problem cannot be solved by switching to the
"huge" address model because the MSC compiler does not support
"huge" pointers, only "huge" arrays. In fact, there is no way
to get the compiler to perform pointer arithmetic properly.
c. New features
None.
5. Released 10-Mar-90
a. Bug fixes
None.
b. Changes
i. All heap access functions are available in "near", "far", and
"huge" flavors in addition to the default. The model-specific
versions are all prefixed with '_n', '_f', and '_h' respectively.
This lets a "small" model program use _frealloc() on the "far"
heap, a capability missing from the original library.
The original functions that do not follow this convention are
still available from the original library (halloc, hfree, _freect,
_memmax, and _memavl).
ii. All functions are written in C so that users without an
assembler may recompile the library at will.
c. New features
i. The _relocate() function is now available in "near" and "far"
models. The _nrelocate() version uses _nheapwalk() to find
relocation targets. The _frelocate() version uses the free
list to find relocation targets.
ii. Debugging and tracing now covers the "near" heap as well as
the "far" heap. However, the debugger is currently unable to
record historical information on "near" heap entries. It
does detect and report errors however.
4. Released 3-Mar-90.
a. Bug fixes
i. Hfree.asm was defining _hfree() rather than hfree. The
typo has been corrected.
ii. _heapwalk() contained a bug related to multiple arenas
that has been fixed.
b. Changes
i. _Fheapset() has a side effect of resetting the default fill
value for the safety margins.
ii. The heap manager ignores changes to _heappad after heap
initialization. It uses a snapshot taken at init time
to avoid variation in the offset from the control structures
to the application data regions.
c. New features
i. The function _heapstat() is available to convert _heapxxx()
status codes to appropriate text strings.
ii. The compiler switch -D_HEAPDEBUG=<stream> where <stream> is
stdout, stderr, or an application error log file enables
special debugging routines. Functions that modify the
heap first check the heap for errors. They also record
the transaction for later reference. When an error is
detected a diagnostic report is written to the idicated
output file For more information see section IV.D.
iii. The compiler switch -D_HEAPTRACE=<stream> enables heap
function tracing to the indicated stream file. For
more information see section IV.E.
iv. Heap.h is now a proper replacement for malloc.h. The library
replaces the far heap in all address models (including "small"
and "medium").
3. Released 15-Feb-90
a. Bug fixes
i. _Heapwalk() is now fully MSC compatible.
b. Changes
i. _Relocate() can handle "huge" heap entries. _Relocate()
ignores _heapmode and always positions the entry as low
as possible in memory to maximize the effect of _heappack().
ii. Both ends of the heap entries get safety margins _heappad
bytes long. The safety margins are automatically filled
with a distinctive bit pattern (0x55) when the region is
allocated.
iii. Chunk headers are now adjacent to the data region. This
reduces the control overhead from 16 to 8 bytes per block,
and removes the need for _HEAPALIGN variants.
c. New features
i. An alternate method of handling invalid pointers was added.
Normally,the functions that accept pointer parameters check
the pointers for validity and silently ignore erroneous
invocations. Applications compiled with -DPTRCHK use
alternate functions which report errors on stderr. The error
reports include name of the offending function, the value of
the invalid pointer, and the source file name and line number.
2. With _HEAPALIGN set TRUE the unused area of the header paragraph
provides an automatic front-end pad like _heappad provides for the
other end of the chunk. However, the length is fixed (6 bytes).
In version 2 _heapset() affects the front-end pad bytes for all
chunks as well as the data areas of free chunks.
The _fheapxxx() functions were jumping to heapxxx() rather than
_heapxxx(). The typo has been corrected
Halloc(), hfree(), and _hsize() are now supported. The pointers
from halloc() are compatible with the pointers from malloc(). This
similarity removes one of the major problems of the MSC heap
manager: pointer disposal. In a mixed environment containing both
far and huge heap entries, the MSC heap manager requires that the
application program track the source of each entry so that the
proper function may be used to dispose of the entry when processing
is complete. There is no such requirement with the replacement heap
manager. Free() will handle any heap pointer correctly.
1. First release
VII. Licensing, distribution, and conditions of use
A. This software and the associated documentation is copyright (c)
1990 Optimal Sofware, All Rights Reserved.
B. This software is provided as a service of Optimal Software.
There are no royalties or fees associated with its use. You
may distribute programs linked with this software without
restriction.
At the urging of several users of this software we have defined
the following guidelines for those wishing to make financial
contributions toward future software development:
Individuals may contribute $25, Students may contribute $15
Organizations may contribute $25 per program linked with the
replacement library, or $100 for an unlimited number of
programs. (This is not a per-copy count. There may be any
number of copies of a single program.)
Commercial software vendors may contribute $100 per product
linked with the replacement library, or $400 for an unlimited
number of products.
Commercial vendors of language products (compilers, assemblers,
interpreters, and run-time libraries) may use the replacement
heap manager in their program products, as described under
commercial software vendors, but may not bundle, merge, or
package it with any product without a written license from
Optimal Software.
All contributors will receive updated copies of the replacement
heap manager.
We suggest that all users utilize the replacement heap manager as
a matter of course, without regard for contributions. If it turns
out that it improves the quality or reliability of your programs,
or if it saves you time and effort, you may consider making a
financial contribution.
If you like the replacement heap manager so much that the above
guidelines are not appropriate, contact Microsoft (R) and tell them
about your requirements for quality (optimal?) software. The address
is: Microsoft, C Product Manager, PO BOX 97017, Redmond, WA 98073-9917.
Microsoft is a registered trademark of Microsoft Corporation.
C. Source code for the replacement heap manager is available. Contact
Optimal Software directly for more information. Educational
institutions may qualify for a free source license.
D. There is a restriction on the sale or distribution-for-profit of
this software. A written license from Optimal Software is required
to sell the software in source or object (unlinked) form.
No license is required if the software is distributed for free.
However, the entire package must be copied, including this
documentation.
E. Naturally we are interested in suggestions and improvements. If
you have any please let us know.
Lee Winter
Optimal Software
4 Lacy Lane
Nashua, NH 03061-2151
603-880-9844
CIS [73710,406]